22 research outputs found

    A framework for exploration and cleaning of environmental data : Tehran air quality data experience

    Management and cleaning of large environmental monitoring data sets is a specific challenge. In this article, we present a novel framework for exploring and cleaning large datasets. As a case study, we applied the method to air quality data from Tehran, Iran, from 1996 to 2013. The framework consists of data acquisition [here, data on particulate matter with aerodynamic diameter ≤10 µm (PM10)], development of databases, initial descriptive analyses, removal of inconsistent data using a plausibility range (PR), and detection of missing-data patterns. Additionally, we developed a novel tool, the spatiotemporal screening tool (SST), which considers both the spatial and temporal nature of the data in the process of outlier detection. We also evaluated the effect of dust storms in the outlier detection phase. The raw mean concentration of PM10 before implementation of the algorithms was 88.96 µg/m3 for 1996-2013 in Tehran. After implementing the algorithms, 5.7% of data points in total were recognized as unacceptable outliers, of which 69% were detected by the SST and 1% by the dust storm algorithm. In addition, 29% of unacceptable outlier values were outside the PR. The mean concentration of PM10 after implementation of the algorithms was 88.41 µg/m3. However, the standard deviation decreased significantly, from 90.86 µg/m3 to 61.64 µg/m3, after implementation of the algorithms. There was no distinguishable significant pattern in missing data according to hour, day, month, or year. We developed a novel framework for cleaning large environmental monitoring data, which can identify hidden patterns. We also presented a complete picture of PM10 from 1996 to 2013 in Tehran. Finally, we propose implementation of our framework on large spatiotemporal databases, especially in developing countries
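
The plausibility-range step mentioned in the abstract above can be illustrated with a minimal sketch. The bounds (0 to 1000 µg/m3), the function name `filter_plausible`, and the sample readings are all hypothetical; the abstract does not state the range the authors used, and this sketch does not reproduce the spatiotemporal screening tool or the dust storm algorithm.

```python
from statistics import mean, stdev


def filter_plausible(values, low, high):
    """Keep only readings inside the closed interval [low, high]."""
    return [v for v in values if low <= v <= high]


# Hypothetical hourly PM10 readings in µg/m3 (not data from the paper).
readings = [45.0, 80.0, 120.0, -5.0, 95.0, 2500.0]

# Assumed plausibility range; the real bounds are not given in the abstract.
cleaned = filter_plausible(readings, low=0.0, high=1000.0)

print("raw:     mean=%.2f sd=%.2f" % (mean(readings), stdev(readings)))
print("cleaned: mean=%.2f sd=%.2f" % (mean(cleaned), stdev(cleaned)))
```

In this toy series, dropping the two implausible values shrinks the standard deviation sharply. The abstract reports a large drop in standard deviation for the real data as well, though there the mean barely moved, which the toy values do not imitate.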

    An integrated MCDM approach to evaluate public transportation systems in Tehran

    Public transportation is one of the most important systems in transportation, especially in big and crowded cities. As a result, evaluation of public transportation systems is a strategic decision-making problem for both the private and public sectors. In this paper, the problem of public transportation passengers in Tehran is addressed and their satisfaction levels are assessed using a passenger satisfaction survey. An integrated MCDM approach is proposed for evaluation of public transportation systems based on the Delphi method, the group analytic hierarchy process (GAHP), and the preference ranking organization method for enrichment of evaluations (PROMETHEE). The proposed model provides more reliable and realistic results and introduces directions for future improvements of public transportation service quality. A sensitivity analysis is applied to investigate the influence of criteria weights on the decision-making problem. In conclusion, the most important public transportation systems in Tehran are, in order: metro, taxi, BRT, bus, and van. Therefore, Tehran Municipality and policy makers should encourage and support these systems

    Additional file 1 of A novel dynamic Bayesian network approach for data mining and survival data analysis

    Additional file 1: Supplementary file 1. The validation of structure learning. The posterior classification error of HC and Tabu algorithms for all nodes according to different score functions. Supplementary file 2. Conditional probability distribution of time stationary variables in the model. Supplementary figure 1. Conditional probability distribution of node stage given the different levels of its parents (Metastasis and TNM). Supplementary figure 2. Conditional probability distribution of node TNM given the different levels of its parent (Metastasis). Supplementary figure 3. Conditional probability distribution of node metastasis given the different levels of its parent (Smoking). Supplementary figure 4. Conditional probability distribution of node pathology given the different levels of its parent (Sex). Supplementary figure 5. Conditional probability distribution of node smoking given the different levels of its parent (Sex). Supplementary figure 6. Conditional probability distribution of node surgery given the different levels of its parent (Site)

    A Bayesian latent class extension of naive Bayesian classifier and its application to the classification of gastric cancer patients

    Background: The Naive Bayes (NB) classifier is a powerful supervised algorithm widely used in Machine Learning (ML). However, its effectiveness relies on a strict assumption of conditional independence, which is often violated in real-world scenarios. To address this limitation, various studies have explored extensions of NB that tackle the violation of conditional independence in the data. These approaches fall into two main categories: feature selection and structure expansion. In this study, we propose a novel approach to enhancing NB by introducing a latent variable as the parent of the attributes. We define this latent variable using a flexible technique called Bayesian Latent Class Analysis (BLCA). Our final model combines the strengths of NB and BLCA, and we refer to it as NB-BLCA. By incorporating the latent variable, we aim to capture complex dependencies among the attributes and improve the overall performance of the classifier. Methods: Both the Expectation-Maximization (EM) algorithm and the Gibbs sampling approach were offered for parameter learning. A simulation study was conducted to evaluate the classification performance of the model in comparison with the ordinary NB model. In addition, real-world data on 976 Gastric Cancer (GC) and 1189 Non-ulcer dyspepsia (NUD) patients were used to show the model's performance in an actual application. The validity of the models was evaluated using 10-fold cross-validation. Results: The presented model was superior to ordinary NB in all the simulation scenarios, with higher classification sensitivity and specificity in test data. The accuracy of the NB-BLCA model using Gibbs sampling was 87.77 (95% CI: 84.87-90.29). This index was estimated at 77.22 (95% CI: 73.64-80.53) and 74.71 (95% CI: 71.02-78.15) for the NB-BLCA model using the EM algorithm and the ordinary NB classifier, respectively. Conclusions: Incorporating a latent component into the NB classifier offers numerous advantages, particularly within medical and health-related contexts. By doing so, researchers can bypass the extensive search algorithm and structure learning required in the local learning and structure extension approaches. The inclusion of latent class variables allows for the integration of all attributes during model construction. Consequently, the NB-BLCA model serves as a suitable alternative to conventional NB classifiers when the assumption of independence is violated, especially in domains pertaining to health and medicine
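
For readers unfamiliar with the baseline, the conditional independence assumption of the ordinary NB classifier discussed above can be sketched as follows. This is a minimal categorical Naive Bayes with Laplace smoothing, not the NB-BLCA model; the function names, the smoothing parameter `alpha`, and the toy records are hypothetical and only borrow the GC/NUD class labels for illustration.

```python
import math
from collections import Counter, defaultdict


def fit_naive_bayes(rows, labels, alpha=1.0):
    """Count class frequencies and per-class feature values (categorical data)."""
    class_counts = Counter(labels)
    # feature_counts[(j, c)][v]: rows of class c whose j-th feature equals v
    feature_counts = defaultdict(Counter)
    values = defaultdict(set)  # distinct values seen for each feature index
    for row, c in zip(rows, labels):
        for j, v in enumerate(row):
            feature_counts[(j, c)][v] += 1
            values[j].add(v)
    return class_counts, feature_counts, values, alpha


def predict(model, row):
    """Return the class maximizing log P(c) + sum_j log P(x_j | c)."""
    class_counts, feature_counts, values, alpha = model
    n = sum(class_counts.values())
    scores = {}
    for c, nc in class_counts.items():
        log_p = math.log(nc / n)
        # Conditional independence: each feature contributes its own factor.
        for j, v in enumerate(row):
            num = feature_counts[(j, c)][v] + alpha
            den = nc + alpha * len(values[j])
            log_p += math.log(num / den)
        scores[c] = log_p
    return max(scores, key=scores.get)


# Toy, made-up records: (symptom present?, marker level) -> diagnosis label.
rows = [("yes", "high"), ("yes", "low"), ("no", "low"), ("no", "low")]
labels = ["GC", "GC", "NUD", "NUD"]
model = fit_naive_bayes(rows, labels)
print(predict(model, ("yes", "high")))
print(predict(model, ("no", "low")))
```

The product over features inside `predict` is exactly where the independence assumption enters; the abstract's proposal replaces it with a structure in which a latent class variable is a parent of the attributes.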

    The 15-year national trends of endocrine cancers incidence among Iranian men and women; 2005–2020

    Cancer is one of the important health problems in Iran, where it is considered the third leading cause of death. Endocrine cancers are rare but mostly curable. Thyroid cancer, the most common endocrine tumor, accounts for about one percent of malignant cancers. In this study, we examined the 15-year national trend of endocrine cancer incidence in Iranian men and women. The data in each province were evaluated based on age, gender, and cancer type according to the International Classification of Disease Codes version 10 (ICD-10) from 2005 to 2020 in Iran. All data were obtained from the reports of the Statistics Center of Iran (SCI), 6 phases of the step-by-step approach to monitoring the risk factors of chronic diseases in people over 18 years old (STEPs), and 3 periods of the CASPIAN study (survey of non-communicable diseases in childhood and adolescence). Statistical analyses and graph generation were done using R statistical software. Poisson regression with mixed effects was used for data modeling and incidence rate estimation. The incidence of thyroid gland malignancy is higher in women than in men. On the other hand, the incidence of adrenal gland cancer is slightly higher in men than in women. The same pattern is observed for other endocrine neoplasms and related structures. The incidence rate of these types of cancers generally increased from 2005 to 2020 in Iran. This increase is greater in women than in men. In addition, a region in the middle of the country shows a markedly high occurrence of these types of cancers. The incidence rate in these provinces is relatively higher for both sexes and all studied periods. We conducted a study to observe the changing trends for various types of endocrine cancers over 15 years in men and women. Considering the increasing trend of thyroid cancers in Iran, creating essential policies for the management of these types of cancers, covering prevention, rapid diagnosis, and timely treatment, is particularly important

    Hepatocellular carcinoma incidence at national and provincial levels in Iran from 2000 to 2016: A meta-regression analysis.

    Background: The incidence of hepatocellular carcinoma (HCC), the most common primary liver cancer with high mortality, is undergoing global change due to evolving risk factor profiles. We aimed to describe the epidemiologic incidence of HCC in Iran by sex, age, and geographical distribution from 2000 to 2016. Methods: We used the Iran Cancer Registry to extract cancer incidence data and applied several statistical procedures to overcome the dataset's incompleteness and misclassifications. Using spatio-temporal and random intercept mixed effect models, we imputed missing values for cancer incidence by sex, age, province, and year. In addition, we addressed case duplicates and geographical misalignments in the data. Results: The age-standardized incidence rate (ASIR) increased 1.17 times from 0.57 (95% UI: 0.37-0.78) per 100,000 population in 2000 to 0.67 (0.50-0.85) in 2016. It had a 21.8% total percentage change increase during this time, with a 1.28 annual percentage change in both sexes. The male to female ASIR ratio was 1.51 in 2000 and 1.57 in 2016. Overall, after the age of 50 years, HCC incidence increased dramatically with age, from 1.19 (0.98-1.40) in the 50-55 age group to 6.65 (5.45-7.78) in the >85 age group. The incidence of this cancer was higher in the central, southern, and southwestern regions of Iran. Conclusion: The HCC incidence rate increased from 2000 to 2016, with a more significant increase in subgroups such as men, individuals over 50 years of age, and the central, southern, and southwestern regions of the country. We recommend that health planners and policymakers adopt more preventive and screening strategies for high-risk populations and provinces in Iran

    National and sub-national exposure to ambient fine particulate matter (PM2.5) and its attributable burden of disease in Iran from 1990 to 2016

    Ambient particulate matter is a public health concern. We aimed (1) to estimate national and provincial long-term exposure of Iranians to ambient particulate matter (PM) < 2.5 µm (PM2.5) from 1990 to 2016, and (2) to estimate the national and provincial burden of disease attributable to PM2.5 in Iran. We used all available ground measurements of PM < 10 µm (PM10) (used to estimate PM2.5) from 91 monitoring stations. We estimated the annual mean exposure to PM2.5 for the entire Iranian population from 1990 to 2016 through a multi-stage modeling process. By applying comparative risk assessment methodology and using a life table for years of life lost (YLL), we estimated the mortality and YLL attributable to PM2.5 for five outcomes. The predicted provincial annual mean PM2.5 concentrations ranged between 21.7 µg/m3 (UI: 19.03-24.9) and 35.4 µg/m3 (UI: 31.4-39.4) from 1990 to 2016. We estimated that in 2016, about 41,000 deaths (95% uncertainty interval [UI] 35634, 47014) and about 3,000,000 YLL (95% UI: 2632101, 3389342) were attributable to long-term exposure to PM2.5 in Iran. Ischemic heart disease was the leading cause of mortality with 31,363 deaths (95% UI: 27520, 35258), followed by stroke (7012 (5999, 8062) deaths), lower respiratory infection (1210 (912, 1519) deaths), chronic obstructive pulmonary disease (1019 (715, 1328) deaths), and lung cancer (668 (489, 848) deaths). In 2016, about 43% of all PM2.5-related mortality in Iran occurred in the following provinces: Tehran (12.6%), Isfahan (9.3%), Khorasan Razavi (8.0%), Fars (6.5%), and Khozestan (6.4%). In summary, we found that the majority of Iranians were exposed to levels of ambient particulate matter exceeding the WHO guidelines from 1990 to 2016. Further, we found an increasing trend of total mortality attributed to PM2.5 in Iran from 1990 to 2016, with a steeper slope in western provinces

    Comparison of the Safety and Immunogenicity of FAKHRAVAC and BBIBP-CorV Vaccines when Administrated as Booster Dose: A Parallel Two Arms, Randomized, Double Blind Clinical Trial

    Purpose: This study was completed to assess the immunogenicity and safety of the FAKHRAVAC and BBIBP-CorV vaccines as a booster dose in a population with a history of receiving two doses of the BBIBP-CorV vaccine. Methods: In this double-blind, parallel clinical trial, we randomly assigned healthy adults with a history of receiving two doses of the BBIBP-CorV vaccine to receive either the FAKHRAVAC or BBIBP-CorV vaccine as a booster dose. The trial is registered in the Iranian Registry of Clinical Trial document depository (Code: IRCT20210206050259N4). Results: The outcomes monitored in this study were serum neutralizing antibody (Nab) activity, immunoglobulin G (IgG) level, local and systemic adverse reactions, serious adverse events, suspected unexpected serious adverse reactions, and medically attended adverse events. After administering vaccines to 435 participants, the most frequent local and systemic adverse reactions were tenderness and nausea in 23.7% and 1.4% of cases, respectively. All adverse events were mild, occurred at a similar incidence in the two groups, and resolved within a few days. Conclusions: On the 14th day after the booster dose injection, the seroconversion rates (i.e., four-fold increase) of Nabs for seronegative participants were 87% and 84.6% in the FAKHRAVAC® and BBIBP-CorV groups, respectively. This study shows that the FAKHRAVAC® vaccine, as a booster dose, performs similarly to the BBIBP-CorV vaccine in terms of increasing the titer of virus-neutralizing antibodies, the amount of specific antibodies, and safety
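
The seroconversion criterion quoted in the abstract above (a four-fold increase in neutralizing antibody titer) can be expressed as a short calculation. This is a sketch under stated assumptions: the function name `seroconversion_rate`, the "at least four-fold" reading of the threshold, and the titer values are hypothetical and are not taken from the trial.

```python
def seroconversion_rate(baseline, post, fold=4.0):
    """Fraction of participants whose post-booster titer is >= fold x baseline."""
    if len(baseline) != len(post):
        raise ValueError("baseline and post must have the same length")
    if not baseline:
        raise ValueError("at least one participant is required")
    converted = sum(1 for b, p in zip(baseline, post) if p >= fold * b)
    return converted / len(baseline)


# Made-up titers for four participants (baseline vs. day 14 after booster).
baseline_titers = [10, 10, 20, 40]
day14_titers = [40, 80, 40, 200]

rate = seroconversion_rate(baseline_titers, day14_titers)
print(f"seroconversion rate: {rate:.0%}")
```

Whether the threshold is inclusive (exactly four-fold counts) is an assumption here; the trial protocol would define that detail.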

    Estimating national dioxins and furans emissions, major sources, intake doses, and temporal trends in Iran from 1990-2010

    Polychlorinated dibenzo-p-dioxins (PCDDs) and dibenzofurans (PCDFs) are highly toxic persistent organic pollutants (POPs), which can cause various adverse health outcomes, such as cancer. As a part of the National and Sub-national Burden of Disease Study (NASBOD), we aimed to estimate national emissions of dioxins and furans, identify their main sources, estimate daily intake doses, and assess their trend from 1990-2010 in Iran. The Toolkit for Identification and Quantification of Releases of Dioxins, Furans and Other Unintentional POPs, developed by the United Nations Environment Programme (UNEP 2013), was used to estimate the emissions of PCDD/PCDFs from several sources into the air, water, land, residue, and other products. The daily intake doses were estimated using a linear regression of emissions estimated by the UNEP Toolkit and average intake doses in other countries. Finally, the trends of PCDD/PCDFs emissions and daily intake doses were explored from 1990-2010. The total emissions were estimated as 960 g Toxic Equivalents (g TEQ) for 1990 and 1957 g TEQ for 2010 (18.2 and 26.8 g TEQ per million capita, respectively). The estimates suggest that although the contribution of open burning to PCDD/PCDFs emissions declined from 1990 to 2010, it remained the major source of emissions in Iran, contributing about 45.8% of total emissions in 1990 and 35.7% in 2010. We further found that PCDD/PCDFs are mostly emitted into the ambient air, followed by residue, land, products, and water. The daily intake doses were estimated to be 3.1 and 5.4 pg TEQ/kg bw/day for 1990 and 2010, respectively. We estimated an increasing trend for PCDD/PCDFs emissions and intake doses in Iran from 1990-2010. The high levels of emissions and intake doses, and their increasing trend, may pose a substantial health risk to the Iranian population. Further studies with more rigorous methods are recommended, but this should not delay appropriate policy actions against these pollutants. Currently, Iran has no standard for dioxins and furans. Adaptation of the guidelines recommended by the World Health Organization might be an appropriate starting point to control dioxins and furans emissions